5 research outputs found

    Gamification Framework for Sensor Data Analytics

    Data in all its forms is becoming a central part of our existence; it is being captured in every facet of our everyday lives: social media, pictures, smartphones, wearable devices, smart buildings, etc. One of the main drivers of this Big Data Revolution is the Internet of Things, which enables inert objects to communicate through a multitude of sensors. The data amassed fuels a thirst for information, and the extraction of such knowledge is made possible through data analytics techniques. However, when it comes to sensor data, our ability to perform large-scale analytics is highly limited by the difficulties associated with collecting sensor data labels. Current crowdsourcing platforms, historically used to gather labels, are unable to process sensor data due to its low-level nature. The solution proposed in this thesis enables the deployment of a crowdsourcing platform for sensor data. This research presents a novel solution to acquire sensor labels by leveraging the power of crowdsourcing using gamification. The work in this thesis describes not only a framework that facilitates the capture of sensor data labels through a flexible gamification architecture but also a solution that outlines the mechanics required to integrate gamification in a variety of contexts. Additionally, the framework is designed in a flexible manner to support any type of sensor data, provided that humans can readily interact with them. The work presented also describes and supports both real-time and historical data analytics through the captured data and associated labels. This work was successfully evaluated in a case study where the gamification implementation was tested for a number of electrical sensors. Real-time and historical data analytics were successfully performed with the use of the framework. The robustness of the solution was evaluated through the injection of invalid data, and the results showed that the framework is capable of reducing the level of noise in the data labels.

    Energy Forecasting for Event Venues: Big Data and Prediction Accuracy

    Advances in sensor technologies and the proliferation of smart meters have resulted in an explosion of energy-related data sets. These Big Data have created opportunities for development of new energy services and a promise of better energy management and conservation. Sensor-based energy forecasting has been researched in the context of office buildings, schools, and residential buildings. This paper investigates sensor-based forecasting in the context of event-organizing venues, which present an especially difficult scenario due to large variations in consumption caused by the hosted events. Moreover, the significance of the data set size, specifically the impact of temporal granularity, on energy prediction accuracy is explored. Two machine-learning approaches, neural networks (NN) and support vector regression (SVR), were considered together with three data granularities: daily, hourly, and 15 minutes. The approach has been applied to a large entertainment venue located in Ontario, Canada. Daily data intervals resulted in higher consumption prediction accuracy than hourly or 15-min readings, which can be explained by the inability of the hourly and 15-min models to capture random variations. With daily data, the NN model achieved better accuracy than the SVR; however, with hourly and 15-min data, there was no definitive dominance of one approach over another. Accuracy of daily peak demand prediction was significantly higher than accuracy of consumption prediction.

    A Gamification Framework for Sensor Data Analytics

    The Internet of Things (IoT) enables connected objects to capture, communicate, and collect information over the network through a multitude of sensors, setting the foundation for applications such as smart grids, smart cars, and smart cities. In this context, large-scale analytics is needed to extract knowledge and value from the data produced by these sensors. The ability to perform analytics on these data, however, is highly limited by the difficulties of collecting labels. Indeed, the machine learning techniques used to perform analytics rely upon data labels to learn and to validate results. Historically, crowdsourcing platforms have been used to gather labels, yet they cannot be directly used in the IoT because of poor human readability of sensor data. To overcome these limitations, this paper proposes a framework for sensor data analytics which leverages the power of crowdsourcing through gamification to acquire sensor data labels. The framework uses gamification as a socially engaging vehicle and as a way to motivate users to participate in various labelling tasks. To demonstrate the proposed framework, a case study is also presented. Evaluation results show the framework can successfully translate gamification events into sensor data labels.

    Challenges for MapReduce in Big Data

    In the Big Data community, MapReduce has been seen as one of the key enabling approaches for meeting continuously increasing demands on computing resources imposed by massive data sets. The reason for this is the high scalability of the MapReduce paradigm, which allows for massively parallel and distributed execution over a large number of computing nodes. This paper identifies MapReduce issues and challenges in handling Big Data with the objective of providing an overview of the field, facilitating better planning and management of Big Data projects, and identifying opportunities for future research in this field. The identified challenges are grouped into four main categories corresponding to Big Data task types: data storage (relational databases and NoSQL stores), Big Data analytics (machine learning and interactive analytics), online processing, and security and privacy. Moreover, current efforts aimed at improving and extending MapReduce to address identified challenges are presented. Consequently, by identifying issues and challenges MapReduce faces when handling Big Data, this study encourages future Big Data research.

    Machine Learning with Big Data for Electrical Load Forecasting

    Today, the amount of data collected is exploding at an unprecedented rate due to developments in Web technologies, social media, mobile and sensing devices, and the Internet of Things (IoT). Data is gathered in every aspect of our lives: from financial information to smart home devices and everything in between. The driving force behind these extensive data collections is the promise of increased knowledge. Therefore, the potential of Big Data relies on our ability to extract value from these massive data sets. Machine learning is central to this quest because of its ability to learn from data and provide data-driven insights, decisions, and predictions. However, traditional machine learning approaches were developed in a different era and thus are based upon multiple assumptions that unfortunately no longer hold true in the context of Big Data. This thesis presents the challenges associated with performing machine learning on Big Data and highlights the cause-effect relationship between the defining dimensions of Big Data and the applications of machine learning techniques. Additionally, emerging machine learning paradigms and how they can handle the challenges are identified. Although many areas of research and applications are affected by these challenges, this thesis focuses on tackling those associated with electrical load forecasting. Consequently, two of the identified challenges are addressed. Firstly, an adaptation of the transformer architecture for electrical load forecasting is proposed in order to address the training time performance-related challenge associated with deep learning algorithms. The results showed improved accuracy for various forecasting horizons over the current state-of-the-art algorithm and addressed performance shortcomings through the architecture's ability to be parallelized. Secondly, a transfer learning algorithm is proposed to scale the learning of load forecasting tasks and effectively address the performance challenges associated with transfer learning. Additionally, the diversity of the data was examined to analyze the portability of the results. In spite of facing various data distributions, the learned concepts and results were repeatable over multiple streams. The results showed significant improvements to machine learning model training time, where the scaled models were 1.7 times faster on average, leading to much more efficient model deployment times.